STEIN'S METHOD, PALM THEORY AND POISSON PROCESS APPROXIMATION

By Louis H. Y. Chen and Aihua Xia 

National University of Singapore and University of Melbourne 

The framework of Stein's method for Poisson process approxi- 
mation is presented from the point of view of Palm theory, which is 
used to construct Stein identities and define local dependence. A gen- 
eral result (Theorem 2.3) in Poisson process approximation is proved 
by taking the local approach. It is obtained without reference to any 
particular metric, thereby allowing wider applicability. A Wasserstein 
pseudometric is introduced for measuring the accuracy of point pro- 
cess approximation. The pseudometric provides a generalization of 
many metrics used so far, including the total variation distance for 
random variables and the Wasserstein metric for processes as in Bar- 
bour and Brown [Stochastic Process. Appl. 43 (1992) 9-31]. Also, 
through the pseudometric, approximation for certain point processes 
on a given carrier space is carried out by lifting it to one on a larger 
space, extending an idea of Arratia, Goldstein and Gordon [Statist.
Sci. 5 (1990) 403-434]. The error bound in the general result is similar
in form to that for Poisson approximation. As it yields the Stein
factor $1/\lambda$ as in Poisson approximation, it provides good
approximation, particularly in cases where $\lambda$ is large. The general result is
applied to a number of problems including Poisson process modeling 
of rare words in a DNA sequence. 

1. Introduction. Poisson approximation was developed by Chen (1975) 
as a discrete version of Stein's normal approximation (1972). It involves the 
solution of a first-order difference equation, which we call a Stein equation. 
In extending Poisson approximation to higher dimensions and to Poisson 
process approximation, Barbour (1988) converted the first-order difference 
equation into a second-order difference equation and solved it in terms of an
immigration-death process. This work was further extended by Barbour and 
Brown (1992), who introduced a Wasserstein metric on point processes and 
initiated a program to obtain error bounds of similar order to that on the 
total variation distance in Poisson approximation. This has been achieved 
for some special cases by Xia (1997, 2000), and a general result with error 
bounds of the desired order has been obtained by Brown, Weinberg and 
Xia (2000). 

In this paper, another general result on Poisson process approximation is 
proved by taking the local approach. It is obtained without reference to any 
particular metric, thereby allowing wider applicability. In proving this result, 
the framework of Stein's method is first presented from the point of view 
of Palm theory, which is used to construct Stein identities and define local 
dependence. Although the connection between Stein's method and Palm 
theory has been known to others [see, e.g., Barbour and Månsson (2002)],
little of it has been exploited. 

In applying the general result, a Wasserstein pseudometric is introduced 
for measuring the accuracy of point process approximation. The pseudo- 
metric provides a generalization of many metrics used so far, including the 
total variation distance for random variables and the Wasserstein metric for 
processes as in Barbour and Brown (1992). Also, through the pseudometric, 
approximation for certain point processes on a given carrier space is carried 
out by lifting it to one on a larger space, extending an idea of Arratia, Gold- 
stein and Gordon [(1990), Section 3.1], which was refined by Chen [(1998), 
Section 5]. 

The error bound in the general result is similar in form to that for Poisson 
approximation [see, e.g., Arratia, Goldstein and Gordon (1989), Theorem 
1]. It is simpler and easier to apply than that in Brown, Weinberg and 
Xia (2000). As it yields the Stein factor $1/\lambda$ as in Poisson approximation, it
provides good approximation, particularly in cases where $\lambda$ is large.

The general result is applied to prove approximation theorems for Matérn
hard-core processes and for marked dependent trials. The latter is in turn 
applied to the classical occupancy problem and rare words in biomolecular 
sequences. The last application, in fact this paper, is motivated by an in- 
terest in modeling the distribution of palindromes in a herpesvirus genome 
by a Poisson process. In Leung, Choi, Xia and Chen (2002), the Poisson 
process model is used to provide a mathematical basis for using r-scans in 
determining nonrandom clusters of palindromes in herpesvirus genomes [see 
also Leung and Yamashita (1999)]. 

2. From Palm theory to Stein's method. Let $\Gamma$ be a fixed locally compact
second countable Hausdorff topological space. Such a space is also a Polish
space, that is, a space for which there exists a separable and complete metric
in $\Gamma$ which generates the topology. Define $\mathcal{H}$ to be the space of nonnegative
integer-valued locally finite measures on $\Gamma$, and let $\mathcal{B}$ be the smallest
$\sigma$-algebra in $\mathcal{H}$ making the mappings $\xi \mapsto \xi(C)$ measurable for all relatively
compact Borel sets $C \subset \Gamma$. Recall that a point process on $\Gamma$ is a measurable
mapping of some fixed probability space into $(\mathcal{H}, \mathcal{B})$ [Kallenberg (1983), page
5]. For a point process $\Xi$ on $\Gamma$ with locally finite mean measure $\boldsymbol{\lambda}$, the point
process $\Xi_\alpha$ is said to be a Palm process associated with $\Xi$ at $\alpha \in \Gamma$ if, for
any measurable function $f : \Gamma \times \mathcal{H} \to \mathbb{R}_+ := [0, \infty)$,

(2.1) $\mathbb{E}\left(\int_\Gamma f(\alpha, \Xi)\,\Xi(d\alpha)\right) = \mathbb{E}\left(\int_\Gamma f(\alpha, \Xi_\alpha)\,\boldsymbol{\lambda}(d\alpha)\right)$

[Kallenberg (1983), Chapter 10]. Intuitively,

$\mathbb{P}(\Xi_\alpha \in B) = \dfrac{\mathbb{E}[1_{[\Xi \in B]}\,\Xi(d\alpha)]}{\mathbb{E}\,\Xi(d\alpha)}$ for all $B \in \mathcal{B}$.

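An important characterization of the Poisson process in the language of Palm
theory is that $\Xi$ is a Poisson process if and only if $\mathcal{L}(\Xi_\alpha) = \mathcal{L}(\Xi + \delta_\alpha)$ $\boldsymbol{\lambda}$-a.s.,
where $\delta_\alpha$ is the Dirac measure at $\alpha$. This highlights an idea of Poisson
process approximation: if we define

$\mathcal{D}f(\xi) := \int_\Gamma f(x, \xi)\,\xi(dx) - \int_\Gamma f(x, \xi + \delta_x)\,\boldsymbol{\lambda}(dx),$

then $\mathcal{L}(\Xi)$ is close to the Poisson process distribution over $\Gamma$ with mean
measure $\boldsymbol{\lambda}$, denoted as $\mathrm{Po}(\boldsymbol{\lambda})$, in terms of a certain metric if, for the set of
suitable corresponding test functions $f : \Gamma \times \mathcal{H} \to \mathbb{R} := (-\infty, \infty)$,

(2.2) $\mathbb{E}\,\mathcal{D}f(\Xi) \approx 0.$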

In other words, for a function $g : \mathcal{H} \to \mathbb{R}$, if we can find a solution $f_g$ to the
equation

(2.3) $g(\xi) - \mathrm{Po}(\boldsymbol{\lambda})(g) = \mathcal{D}f(\xi),$

then the distance between the distribution of $\Xi$ and $\mathrm{Po}(\boldsymbol{\lambda})$ is achieved by
the supremum of $|\mathbb{E}\,\mathcal{D}f_g(\Xi)|$ over the class of $g$ which defines the metric.
Equation (2.3) is known as a Stein equation. If there exists a function $h : \mathcal{H} \to \mathbb{R}$
such that $f(x, \xi) = h(\xi - \delta_x) - h(\xi)$, then

$\mathcal{D}f(\xi) = \int_\Gamma [h(\xi + \delta_x) - h(\xi)]\,\boldsymbol{\lambda}(dx) + \int_\Gamma [h(\xi - \delta_x) - h(\xi)]\,\xi(dx) := \mathcal{A}h(\xi).$

It is known that $\mathcal{A}$ is the generator of an $\mathcal{H}$-valued immigration-death process
$Z_\xi(t)$ with immigration intensity $\boldsymbol{\lambda}$ and unit per capita death rate,
where $Z_\xi(0) = \xi$. This fact was noted by Barbour (1988), who developed a
probabilistic approach to Stein's method for multivariate Poisson and Poisson
process approximations. The equilibrium distribution of $Z_\xi$ is a Poisson
process with mean measure $\boldsymbol{\lambda}$. The idea of introducing a Markov point
process is to exploit the probabilistic properties of the Markov process for
obtaining bounds on the metrics of interest [see Barbour and Brown (1992)
and Brown and Xia (2000)].
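
As a numerical illustration (not part of the original argument), the fact that the Poisson law is the equilibrium of the immigration-death process can be checked in the simplest setting, where the function $h$ depends on a configuration only through its number of points $n$ and the immigration intensity has total mass `lam`. Under these simplifying assumptions the generator acts on counts, and its expectation under the Poisson distribution should vanish. The function names and the truncation level `n_max` below are hypothetical choices made for this sketch.

```python
import math

def poisson_pmf(lam, n_max):
    """Return [P(N = n) for n = 0..n_max] for N ~ Poisson(lam)."""
    pmf = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        pmf.append(pmf[-1] * lam / n)
    return pmf

def generator(h, lam, n):
    """Immigration-death generator restricted to counts:
    (A h)(n) = lam * [h(n + 1) - h(n)] + n * [h(n - 1) - h(n)]."""
    death = n * (h(n - 1) - h(n)) if n > 0 else 0.0
    return lam * (h(n + 1) - h(n)) + death

def expected_generator(h, lam, n_max=200):
    """Approximate E[(A h)(N)] for N ~ Poisson(lam), truncating at n_max."""
    pmf = poisson_pmf(lam, n_max)
    return sum(p * generator(h, lam, n) for n, p in enumerate(pmf))
```

For moderate `lam` and test functions of polynomial growth, `expected_generator` returns a value that is zero up to rounding and truncation error, which is the count-level analogue of the requirement that $\mathbb{E}\,\mathcal{D}f$ vanish under the Poisson process law.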

For $\xi \in \mathcal{H}$ and a Borel set $B \subset \Gamma$, we define $\xi|_B$ as the restriction of $\xi$ to
$B$, that is, $\xi|_B(C) = \xi(B \cap C)$ for Borel sets $C \subset \Gamma$. Let $\Xi$ be a point process
on $\Gamma$ with Palm processes $\{\Xi_\alpha\}$. Assume that for each $\alpha$ there is a Borel set
$A_\alpha \subset \Gamma$ such that $\alpha \in A_\alpha$ and the mapping

(2.4) $\Gamma \times \mathcal{H} \to \Gamma \times \mathcal{H} : (\alpha, \xi) \mapsto (\alpha, \xi^{(\alpha)})$

is product measurable, where $\xi^{(\alpha)} := \xi|_{A_\alpha^c}$. Note that $\xi^{(\alpha)}$ does not refer to
the Palm measure. As the measurability of (2.4) is often hard to check, we
give a sufficient condition for (2.4) to hold: $A = \{(x, y) : y \in A_x, x \in \Gamma\}$ is
a measurable set of the product space $\Gamma^2 := \Gamma \times \Gamma$. We give a brief proof
of the sufficiency. By the monotone class theorem, it suffices to show that
the mapping $M_A(\alpha, \xi) := (\alpha, \xi^{(\alpha)})$ is measurable for rectangular sets
$A = B_1 \times B_2$, where $B_1$ and $B_2$ are measurable subsets of $\Gamma$. Indeed,

$M_{B_1 \times B_2}(x, \xi) = \begin{cases} (x, \xi|_{B_2^c}), & \text{if } x \in B_1, \\ (x, \xi), & \text{if } x \notin B_1, \end{cases}$

is measurable.

The requirement of $A$ being measurable in $\Gamma^2$ is almost necessary. To
see this, let $\Gamma = [0, 1]$, $A = B_1 \times B_2$, where $B_1 \subset \Gamma$ is not Borel measurable
[Nielsen (1997), page 128, 9.16(h)] and $B_2 \subset \Gamma$ is a Borel set. Define
$C = \{\xi : \xi(B_2) \neq 0\}$; then $M_A^{-1}(\Gamma \times C) = B_1^c \times C$ is not a measurable set of $\Gamma \times \mathcal{H}$.

Remark 2.1. In Barbour and Brown [(1992), page 15], it is proved that
if $A_\alpha$ is a ball of fixed radius, then the mapping in (2.4) is measurable.

We define $\Xi$ to be locally dependent with neighborhoods $(A_\alpha; \alpha \in \Gamma)$ if
$\mathcal{L}((\Xi_\alpha)^{(\alpha)}) = \mathcal{L}(\Xi^{(\alpha)})$, $\boldsymbol{\lambda}$-a.s.

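Lemma 2.2. The following statements are equivalent:

(a) $\mathbb{E}\int_\Gamma f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)\,\Xi(d\alpha) = \mathbb{E}\int_\Gamma f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)\,\boldsymbol{\lambda}(d\alpha)$ for all measurable
$f : \Gamma \times \mathcal{H} \to \mathbb{R}_+$.

(b) $\mathcal{L}((\Xi_\alpha)^{(\alpha)}) = \mathcal{L}(\Xi^{(\alpha)})$, $\boldsymbol{\lambda}$-a.s.

Proof. By the definition of Palm process, we have

(2.5) $\mathbb{E}\int_\Gamma f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)\,\Xi(d\alpha) = \mathbb{E}\int_\Gamma f(\alpha, (\Xi_\alpha)^{(\alpha)} + \delta_\alpha)\,\boldsymbol{\lambda}(d\alpha).$

Hence, (b) implies (a). Now assume (a). With the vague topology, $\mathcal{H}$ is a
Polish space [see Kallenberg (1983), page 95], so there exists a sequence
of bounded uniformly continuous functions $(f_j; j \ge 1)$ on $\mathcal{H}$ which form a
determining class [Billingsley (1968), page 15]: for every two probability
measures $Q_1$ and $Q_2$ on $\mathcal{H}$, if $\int f_j\,dQ_1 = \int f_j\,dQ_2$ for all $j \ge 1$, then $Q_1 = Q_2$
[see Parthasarathy (1967), Theorem 6.6]. By taking $f(\alpha, \xi + \delta_\alpha) = k(\alpha) f_j(\xi)$,
it follows from (a) and (2.5) that

$\int_\Gamma k(\alpha)\,[\mathbb{E} f_j(\Xi^{(\alpha)})]\,\boldsymbol{\lambda}(d\alpha) = \int_\Gamma k(\alpha)\,[\mathbb{E} f_j((\Xi_\alpha)^{(\alpha)})]\,\boldsymbol{\lambda}(d\alpha)$

for all bounded measurable functions $k : \Gamma \to \mathbb{R}_+$ and $f_j$. Fixing $f_j$ and allowing
$k$ to vary, we have $\mathbb{E} f_j(\Xi^{(\alpha)}) = \mathbb{E} f_j((\Xi_\alpha)^{(\alpha)})$, $\boldsymbol{\lambda}$-a.s. Now vary $f_j$ and (b)
follows.

□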

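In general, a point process is not necessarily locally dependent, but Lemma 2.2
suggests that, in a loose sense,

(2.6) $\mathcal{L}((\Xi_\alpha)^{(\alpha)}) \approx \mathcal{L}(\Xi^{(\alpha)})$, $\boldsymbol{\lambda}$-a.s.

if and only if

(2.7) $\mathbb{E}\int_\Gamma f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)\,\Xi(d\alpha) \approx \mathbb{E}\int_\Gamma f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)\,\boldsymbol{\lambda}(d\alpha)$ for suitable $f$.

This will be our guiding principle in proving Theorem 2.3 using the local
approach, as follows [an extension of the approach of Chen (1975) which was
elaborated by Barbour and Brown (1992)]:

$$\begin{aligned}
\mathbb{E}\int_\Gamma f(\alpha, \Xi)\,\Xi(d\alpha)
&= \mathbb{E}\int_\Gamma [f(\alpha, \Xi) - f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)]\,\Xi(d\alpha) \\
&\quad + \mathbb{E}\int_\Gamma f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)\,[\Xi(d\alpha) - \boldsymbol{\lambda}(d\alpha)] \\
&\quad + \mathbb{E}\int_\Gamma [f(\alpha, \Xi^{(\alpha)} + \delta_\alpha) - f(\alpha, \Xi + \delta_\alpha)]\,\boldsymbol{\lambda}(d\alpha) \\
&\quad + \mathbb{E}\int_\Gamma f(\alpha, \Xi + \delta_\alpha)\,\boldsymbol{\lambda}(d\alpha),
\end{aligned}$$

which implies

(2.8)
$$\begin{aligned}
\mathbb{E}\,\mathcal{D}f(\Xi)
&= \mathbb{E}\int_\Gamma [f(\alpha, \Xi) - f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)]\,\Xi(d\alpha) \\
&\quad + \mathbb{E}\int_\Gamma f(\alpha, \Xi^{(\alpha)} + \delta_\alpha)\,[\Xi(d\alpha) - \boldsymbol{\lambda}(d\alpha)] \\
&\quad + \mathbb{E}\int_\Gamma [f(\alpha, \Xi^{(\alpha)} + \delta_\alpha) - f(\alpha, \Xi + \delta_\alpha)]\,\boldsymbol{\lambda}(d\alpha).
\end{aligned}$$

Hence, a bound on $\mathbb{E}\,\mathcal{D}f_g(\Xi)$ can be obtained by bounding the right-hand
side of (2.8).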

There are two ways to handle the second term in (2.8): one uses coupling
and the other involves Janossy densities [Janossy (1950) and Daley and
Vere-Jones (1988)]. For a finite point process $\Xi$, that is, $\mathbb{P}(|\Xi| < \infty) = 1$, there
exist measures $(J_n)_{n \ge 1}$ such that, for measurable functions $f : \mathcal{H} \to \mathbb{R}_+$,

$\mathbb{E} f(\Xi) = \sum_{n \ge 0} \int_{\Gamma^n} \frac{1}{n!}\, f\left(\sum_{i=1}^n \delta_{x_i}\right) J_n(dx_1, \ldots, dx_n).$

The term $(n!)^{-1} J_n(dx_1, \ldots, dx_n)$ can be intuitively explained as the probability
of $\Xi$ having $n$ points and these points being located near $(x_1, \ldots, x_n)$.
The measures $(J_n)_{n \ge 1}$ are called Janossy measures by Srinivasan (1969).

Suppose there is a reference measure $\nu$ on $\Gamma$ such that, for each $n \ge 1$, $J_n$
is absolutely continuous with respect to $\nu^n$. Then, by the Radon-Nikodym
theorem, the derivatives $j_n$ of $J_n$ with respect to $\nu^n$ exist, so that

$\mathbb{E} f(\Xi) = \sum_{n \ge 0} \int_{\Gamma^n} \frac{1}{n!}\, f\left(\sum_{i=1}^n \delta_{x_i}\right) j_n(x_1, \ldots, x_n)\,\nu^n(dx_1, \ldots, dx_n).$

The derivatives $(j_n)_{n \ge 1}$ are called Janossy densities.

The density of the mean measure $\boldsymbol{\lambda}$ of a finite point process $\Xi$ with respect
to $\nu$ can be expressed by its Janossy densities $(j_n)_{n \ge 1}$ as

$\phi(x) = \sum_{m \ge 0} \int_{\Gamma^m} \frac{1}{m!}\, j_{m+1}(x, x_1, \ldots, x_m)\,\nu^m(dx_1, \ldots, dx_m),$

where the term with $m = 0$ is interpreted as $j_1(x)$ [Daley and Vere-Jones (1988),
page 133].

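When the point process is simple, the Janossy densities can also be used
to describe the conditional probability density of a point being at $\alpha$, given
the configuration of $\Xi$ outside $A_\alpha$. More precisely, let $m \in \mathbb{N}$ be fixed
and $\beta = (\beta_1, \ldots, \beta_m) \in (A_\alpha^c)^m$, and define

(2.9) $G(\alpha, \beta) = \dfrac{\sum_{r \ge 0} \int_{A_\alpha^r} j_{m+r+1}(\alpha, \beta, \gamma)\,(r!)^{-1}\,\nu^r(d\gamma)}{\sum_{s \ge 0} \int_{A_\alpha^s} j_{m+s}(\beta, \eta)\,(s!)^{-1}\,\nu^s(d\eta)},$

where the term with $r = 0$ is interpreted as $j_{m+1}(\alpha, \beta)$ and the term with
$s = 0$ as $j_m(\beta)$. Then $G(\alpha, \beta)$ is the conditional density of a point being near
$\alpha$ given that $\Xi^{(\alpha)}$ is $\sum_{i=1}^m \delta_{\beta_i}$. Direct verification gives that, for any bounded
measurable function $f$ on $\Gamma \times \mathcal{H}$,

(2.10) $\mathbb{E}\left(\int_\Gamma f(\alpha, \Xi^{(\alpha)})\,\Xi(d\alpha)\right) = \mathbb{E}\left(\int_\Gamma f(\alpha, \Xi^{(\alpha)})\,G(\alpha, \Xi^{(\alpha)})\,\nu(d\alpha)\right).$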
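For each $f : \Gamma \times \mathcal{H} \to \mathbb{R}_+$ and $\xi \in \mathcal{H}$, write $\xi(A_x) = m$ and define

$(\Delta f)(x, \xi) = \sup_{\{z_1, \ldots, z_m\} \subset \Gamma}\ \max_{0 \le j \le m-1} \left| f\left(x, \xi|_{A_x^c} + \delta_x + \sum_{i=1}^{j} \delta_{z_i}\right) - f\left(x, \xi|_{A_x^c} + \delta_x + \sum_{i=1}^{j+1} \delta_{z_i}\right) \right|,$

where the right-hand side is interpreted as 0 if $m = 0$. Combining (2.3) and
(2.8) gives: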

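Theorem 2.3. For each bounded measurable function $g : \mathcal{H} \to \mathbb{R}_+$,

(2.11)
$$\begin{aligned}
|\mathbb{E} g(\Xi) - \mathrm{Po}(\boldsymbol{\lambda})(g)|
&\le \mathbb{E}\int_\Gamma (\Delta f_g)(\alpha, \Xi)\,(\Xi(A_\alpha) - 1)\,\Xi(d\alpha) + \min\{\varepsilon_1(g, \Xi), \varepsilon_2(g, \Xi)\} \\
&\quad + \mathbb{E}\int_\Gamma (\Delta f_g)(\alpha, \Xi)\,\boldsymbol{\lambda}(d\alpha)\,\Xi(A_\alpha),
\end{aligned}$$

where

(2.12) $\varepsilon_1(g, \Xi) = \mathbb{E}\int_{\alpha \in \Gamma} |f_g(\alpha, \Xi^{(\alpha)} + \delta_\alpha)|\,|G(\alpha, \Xi^{(\alpha)}) - \phi(\alpha)|\,\nu(d\alpha),$

which is valid if $\Xi$ is a simple point process, and

(2.13) $\varepsilon_2(g, \Xi) = \mathbb{E}\int_{\alpha \in \Gamma} |f_g(\alpha, \Xi^{(\alpha)} + \delta_\alpha) - f_g(\alpha, (\Xi_\alpha)^{(\alpha)} + \delta_\alpha)|\,\boldsymbol{\lambda}(d\alpha).$

Remark 2.4. How judiciously the neighborhoods $(A_\alpha; \alpha \in \Gamma)$ are chosen is reflected in the
upper bound in (2.11), and (2.13) suggests that $(A_\alpha; \alpha \in \Gamma)$ should normally
be chosen such that (2.6) holds.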

3. Poisson process approximation in Wasserstein pseudometric. We now
look at special test functions $g$ which define metrics of our interest. We begin
with a pseudometric $\rho_0$ on $\Gamma$ bounded by 1 [cf. Barbour and Brown (1992)].
In order for Theorem 2.3 to be applicable, we assume that the topology
generated by $\rho_0$ is weaker than the given topology of $\Gamma$. Let $\mathcal{K}$ stand for the
set of $\rho_0$-Lipschitz functions $k : \Gamma \to [-1, 1]$ such that $|k(\alpha) - k(\beta)| \le \rho_0(\alpha, \beta)$
for all $\alpha, \beta \in \Gamma$. The first Wasserstein pseudometric $\rho_1$ is defined on $\mathcal{H}$ by

$\rho_1(\xi_1, \xi_2) = \begin{cases} 0, & \text{if } |\xi_1| = |\xi_2| = 0, \\ 1, & \text{if } |\xi_1| \neq |\xi_2|, \\ |\xi_1|^{-1} \sup_{k \in \mathcal{K}} \left| \int k\,d\xi_1 - \int k\,d\xi_2 \right|, & \text{if } |\xi_1| = |\xi_2| > 0, \end{cases}$

where $|\xi_i|$ is the total mass of $\xi_i$. A pseudometric $\rho_1'$ equivalent to $\rho_1$ can
be defined as follows [cf. Brown and Xia (1995)]: for two configurations
$\xi_1 = \sum_{i=1}^n \delta_{y_i}$ and $\xi_2 = \sum_{i=1}^m \delta_{z_i}$ with $m \ge n$,

$\rho_1'(\xi_1, \xi_2) = \min_\pi \sum_{i=1}^n \rho_0(y_i, z_{\pi(i)}) + (m - n),$

where $\pi$ ranges over all permutations of $(1, \ldots, m)$.
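
For small configurations the matching pseudometric just defined can be evaluated by brute force, which may help in checking examples by hand. The sketch below is only illustrative: it assumes configurations are given as Python lists of points and that `rho0` is any user-supplied pseudometric bounded by 1; the name `rho1_prime` is a hypothetical choice. It enumerates all injective assignments of the smaller configuration into the larger one, so it is practical only for a handful of points.

```python
from itertools import permutations

def rho1_prime(xi1, xi2, rho0):
    """Brute-force matching distance between two finite configurations.

    Each point of the smaller configuration is matched to a distinct point
    of the larger one at minimal total rho0-cost; the difference in the
    number of points is then added.
    """
    small, large = sorted((list(xi1), list(xi2)), key=len)
    n, m = len(small), len(large)
    best = min(
        sum(rho0(y, large[j]) for y, j in zip(small, assignment))
        for assignment in permutations(range(m), n)
    )
    return best + (m - n)
```

With `rho0 = lambda a, b: min(abs(a - b), 1.0)` on the real line, the configurations `[0.1, 0.9]` and `[0.85, 0.15, 0.5]` are at distance approximately 1.1: two matched pairs at cost 0.05 each, plus one unmatched point.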

Let $\mathcal{F}$ denote the set of $\rho_1$-Lipschitz functions on $\mathcal{H}$ such that
$|f(\xi_1) - f(\xi_2)| \le \rho_1(\xi_1, \xi_2)$ for all $\xi_1$ and $\xi_2 \in \mathcal{H}$. The second Wasserstein pseudometric
is defined on probability measures on $\mathcal{H}$ with respect to $\rho_1$ by

$\rho_2(Q_1, Q_2) = \sup_{f \in \mathcal{F}} \left| \int f\,dQ_1 - \int f\,dQ_2 \right|.$

The use of a pseudometric $\rho_0$ provides not only generality but also wider
applicability. For example, if we choose $\rho_0(x, y) \equiv 0$, then

$\rho_2(Q_1, Q_2) = d_{TV}(\mathcal{L}(|X_1|), \mathcal{L}(|X_2|)),$

the total variation distance between $\mathcal{L}(|X_1|)$ and $\mathcal{L}(|X_2|)$, where $X_i$ has
distribution $Q_i$, $i = 1, 2$. It is known that, for $g = 1_B$ with $B \subset \mathbb{Z}_+ := \{0, 1, 2, \ldots\}$,

$(\Delta f_g)(x, \xi) \le \dfrac{1 - e^{-\lambda}}{\lambda}, \qquad |f_g| \le 1 \wedge \sqrt{\dfrac{2}{e\lambda}},$

where, and throughout this paper, $\lambda$ is the total mass of $\boldsymbol{\lambda}$ and is assumed to
be finite [see Barbour, Holst and Janson (1992) and Brown and Xia (2001)].
So Theorem 2.3 gives:

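Theorem 3.1. We have

$$\begin{aligned}
d_{TV}(\mathcal{L}(\Xi(\Gamma)), \mathrm{Po}(\lambda))
&\le \frac{1 - e^{-\lambda}}{\lambda}\,\mathbb{E}\int_{\alpha \in \Gamma} (\Xi(A_\alpha) - 1)\,\Xi(d\alpha) + \min\{\varepsilon_1, \varepsilon_2\} \\
&\quad + \frac{1 - e^{-\lambda}}{\lambda} \int_{\alpha \in \Gamma} \boldsymbol{\lambda}(A_\alpha)\,\boldsymbol{\lambda}(d\alpha),
\end{aligned}$$

where

$\varepsilon_1 = \left(1 \wedge \sqrt{\dfrac{2}{e\lambda}}\right) \int_{\alpha \in \Gamma} \mathbb{E}\,|G(\alpha, \Xi^{(\alpha)}) - \phi(\alpha)|\,\nu(d\alpha),$

which is valid for $\Xi$ simple, and

$\varepsilon_2 = \dfrac{1 - e^{-\lambda}}{\lambda} \int_{\alpha \in \Gamma} \mathbb{E}\,\big|\,|\Xi^{(\alpha)}| - |(\Xi_\alpha)^{(\alpha)}|\,\big|\,\boldsymbol{\lambda}(d\alpha).$

Theorem 3.1 with $\varepsilon_1$ is a generalization of Chen (1975) [see also Barbour
and Brown (1992)] and with $\varepsilon_2$ allows the use of the coupling approach [see
Barbour and Brown (1992)].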
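
To see the size of the bound in Theorem 3.1 in the most elementary case, consider a sum of independent indicators with success probabilities $p_1, \ldots, p_n$, placed at $n$ distinct sites with neighborhoods $A_\alpha = \{\alpha\}$. On our reading of the theorem (an assumption of this illustration, not a statement from the text), the first term then vanishes, the coupling term can be taken to be zero by independence, and only the last term $\lambda^{-1}(1 - e^{-\lambda}) \sum_i p_i^2$ survives. The Python sketch below compares this quantity with the exact total variation distance; all function names are hypothetical.

```python
import math

def poisson_binomial_pmf(ps):
    """Exact pmf of a sum of independent Bernoulli(p_i) indicators."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf

def tv_to_poisson(ps):
    """Total variation distance between the sum and Poisson(sum of p_i)."""
    lam = sum(ps)
    pmf = poisson_binomial_pmf(ps)
    poi = [math.exp(-lam)]
    for k in range(1, len(pmf)):
        poi.append(poi[-1] * lam / k)
    tail = max(0.0, 1.0 - sum(poi))  # Poisson mass above n, where pmf is 0
    return 0.5 * (sum(abs(a - b) for a, b in zip(pmf, poi)) + tail)

def local_bound_independent(ps):
    """The surviving term (1 - e^{-lam}) / lam * sum of p_i^2."""
    lam = sum(ps)
    return (1.0 - math.exp(-lam)) / lam * sum(p * p for p in ps)
```

For a single indicator the two quantities coincide, both being $p(1 - e^{-p})$, so the factor $\lambda^{-1}(1 - e^{-\lambda})$ cannot be improved in general.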

Another example is in Section 4, where it is possible to introduce an index
space so that the results also include the approximation in distribution by a
Poisson process to discrete sums of the form $\sum_{i=1}^n X_i \delta_{Y_i}$, where $Y_i$ is a random
mark associated with $X_i$, as in Arratia, Goldstein and Gordon (1989).

We now establish a general statement of this section. As the arguments
in Barbour and Brown (1992) and Brown and Xia (2001) never rely on the
property that $\rho_0(x, y) = 0$ implies $x = y$, the results are still valid for $\rho_0$ and
the pseudometrics $\rho_1$ and $\rho_2$ generated from $\rho_0$. The following two lemmas
are taken from Barbour and Brown (1992) and Brown and Xia (2001).

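Lemma 3.2. For each $\rho_1$-Lipschitz function $g \in \mathcal{F}$, $x, y \in \Gamma$ and $\xi \in \mathcal{H}$
with $|\xi| = n$, the solution $f_g$ of (2.3) satisfies

(3.1) $|f_g(x, \xi + \delta_x + \delta_y) - f_g(x, \xi + \delta_x)| \le \dfrac{5}{\lambda} + \dfrac{3}{n + 1},$

(3.2) $|f_g(y, \xi + \delta_y)| \le 1 \wedge 1.65\,\lambda^{-1/2}.$

Lemma 3.3. For each $g \in \mathcal{F}$, $\xi, \eta \in \mathcal{H}$ and $x \in \Gamma$,

$|f_g(x, \xi + \delta_x) - f_g(x, \eta + \delta_x)| \le \left(\dfrac{5}{\lambda} + \dfrac{3}{|\eta| \wedge |\xi| + 1}\right) \rho_1'(\xi, \eta).$

With the above two lemmas, we write another version of Theorem 2.3.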

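Theorem 3.4. We have

(3.3)
$$\begin{aligned}
\rho_2(\mathcal{L}\Xi, \mathrm{Po}(\boldsymbol{\lambda}))
&\le \mathbb{E}\int_{\alpha \in \Gamma} \left(\frac{5}{\lambda} + \frac{3}{|\Xi^{(\alpha)}| + 1}\right) (\Xi(A_\alpha) - 1)\,\Xi(d\alpha) + \min\{\varepsilon_1, \varepsilon_2\} \\
&\quad + \mathbb{E}\int_{\alpha \in \Gamma} \int_{\beta \in A_\alpha} \left(\frac{5}{\lambda} + \frac{3}{|(\Xi_\beta)^{(\alpha)}| + 1}\right) \boldsymbol{\lambda}(d\alpha)\,\boldsymbol{\lambda}(d\beta),
\end{aligned}$$

where

(3.4) $\varepsilon_1 = (1 \wedge (1.65\,\lambda^{-1/2})) \int_{\alpha \in \Gamma} \mathbb{E}\,|G(\alpha, \Xi^{(\alpha)}) - \phi(\alpha)|\,\nu(d\alpha),$

(3.5) $\varepsilon_2 = \mathbb{E}\int_{\alpha \in \Gamma} \left(\dfrac{5}{\lambda} + \dfrac{3}{|(\Xi_\alpha)^{(\alpha)}| \wedge |\Xi^{(\alpha)}| + 1}\right) \rho_1'(\Xi^{(\alpha)}, (\Xi_\alpha)^{(\alpha)})\,\boldsymbol{\lambda}(d\alpha).$

In many applications, we can obtain the Stein factor $1/\lambda$ from the terms
$(|\Xi^{(\alpha)}| + 1)^{-1}$, $(|(\Xi_\alpha)^{(\alpha)}| \wedge |\Xi^{(\alpha)}| + 1)^{-1}$, $(|(\Xi_\beta)^{(\alpha)}| + 1)^{-1}$,
by applying Lemma 3.5.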

Lemma 3.5 [Brown, Weinberg and Xia (2000), Lemma 3.1]. For a random
variable $X \ge 1$,

$\mathbb{E}\left(\dfrac{1}{X}\right) \le \dfrac{\sqrt{\kappa(1 + \kappa/4)} + 1 + \kappa/2}{\mathbb{E}(X)},$

where $\kappa = \mathrm{Var}(X)/\mathbb{E}(X)$.
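
A quick numerical check of Lemma 3.5 can be made with a shifted Poisson variable, for which the inverse moment is available in closed form. The sketch below assumes $X = N + 1$ with $N$ Poisson of mean `mu`, so that $\mathbb{E}(X) = \mu + 1$, $\mathrm{Var}(X) = \mu$ and $\mathbb{E}(1/X) = (1 - e^{-\mu})/\mu$; the function names are hypothetical and the example is an illustration rather than part of the original development.

```python
import math

def inverse_moment_bound(mean, var):
    """Upper bound of Lemma 3.5 on E(1/X) for X >= 1, given E(X) and Var(X)."""
    kappa = var / mean
    return (math.sqrt(kappa * (1.0 + kappa / 4.0)) + 1.0 + kappa / 2.0) / mean

def exact_inverse_moment_shifted_poisson(mu):
    """E[1/(N + 1)] for N ~ Poisson(mu), which equals (1 - e^{-mu}) / mu."""
    return (1.0 - math.exp(-mu)) / mu
```

Since $\sqrt{\kappa(1 + \kappa/4)} \le 1 + \kappa/2$, the bound is at most $(2 + \kappa)/\mathbb{E}(X)$, which is of order $1/\mathbb{E}(X)$ whenever $\kappa$ stays bounded.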

Corollary 3.6. If S is a locally dependent point process with neigh- 
borhoods (A a ;a£T), then 

p 2 (£E, Po(A)) < E jf (f + ^J\ + l ) ( E ( A *) ~ ^ E ( da ) 

(3.6) 

where ^=£\ A c anA c. 

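Remark 3.7. Since

$$\begin{aligned}
\int_{\alpha \in \Gamma} \frac{\Xi(A_\alpha) - 1}{|\Xi^{(\alpha)}| + 1}\,\Xi(d\alpha)
&= \int_{\alpha \in \Gamma} \int_{\beta \in A_\alpha \setminus \{\alpha\}} \frac{1}{|\Xi^{(\alpha)}| + 1}\,\Xi(d\beta)\,\Xi(d\alpha) \\
&\le \int_{\alpha \in \Gamma} \int_{\beta \in A_\alpha \setminus \{\alpha\}} \frac{1}{|\Xi^{(\alpha\beta)}| + 1}\,\Xi(d\beta)\,\Xi(d\alpha),
\end{aligned}$$

to simplify the first term of (3.6) using the assumption of local dependence,
it is tempting to ask whether

$\mathbb{E}\left[\dfrac{1}{|\Xi^{(\alpha\beta)}| + 1}\,\Xi(d\beta)\,\Xi(d\alpha)\right] = \mathbb{E}\left[\dfrac{1}{|\Xi^{(\alpha\beta)}| + 1}\right] \mathbb{E}\,\Xi(d\beta)\,\Xi(d\alpha).$

The answer is generally negative, although it might be true in many applications,
as shown in Section 5. To see this, let $\mathbb{P}(B_i) = q = 0.1$ for $i = 1, 2, 3$,
$\mathbb{P}(B_i B_j) = q^2$ for $1 \le i \neq j \le 3$ and $\mathbb{P}(B_1 B_2 B_3) = 2q^3$. Set $\Gamma = \{1, 2, 3\}$,
$\Xi(\{i\}) = 1_{B_i}$, $1 \le i \le 3$, and $A_1 = A_2 = \{1, 2\}$ and $A_3 = \{1, 3\}$; then $\Xi$ is
locally dependent with neighborhoods $(A_i; i \in \Gamma)$. However, direct calculation
gives

$\mathbb{E}\left[\dfrac{1}{\Xi(\{3\}) + 1}\,\Xi(\{1\})\,\Xi(\{2\})\right] = q^2 - q^3$

and

$\mathbb{E}\left[\dfrac{1}{\Xi(\{3\}) + 1}\right] \mathbb{E}\,\Xi(\{1\})\,\Xi(\{2\}) = (1 - 0.5q)\,q^2,$

so

$\mathbb{E}\left[\dfrac{1}{\Xi(\{3\}) + 1}\,\Xi(\{1\})\,\Xi(\{2\})\right] \neq \mathbb{E}\left[\dfrac{1}{\Xi(\{3\}) + 1}\right] \mathbb{E}\,\Xi(\{1\})\,\Xi(\{2\}).$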
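
The two values in the example of Remark 3.7 can be verified mechanically by writing down the joint law of the three indicators. The Python sketch below does this by inclusion-exclusion; it assumes only the probabilities stated in the remark (single events $q$, pairs $q^2$, triple $2q^3$), and the function names are hypothetical.

```python
from itertools import product

def atom_probabilities(q):
    """Joint law of (1_{B1}, 1_{B2}, 1_{B3}) with P(B_i) = q,
    P(B_i B_j) = q^2 and P(B_1 B_2 B_3) = 2 q^3, by inclusion-exclusion."""
    p3 = 2 * q ** 3                 # all three events occur
    p2 = q ** 2 - p3                # exactly one given pair occurs
    p1 = q - 2 * q ** 2 + p3        # exactly one given event occurs
    p0 = 1 - 3 * p1 - 3 * p2 - p3   # no event occurs
    by_count = {0: p0, 1: p1, 2: p2, 3: p3}
    return {x: by_count[sum(x)] for x in product((0, 1), repeat=3)}

def both_sides(q):
    """Return (E[I1 I2 / (I3 + 1)], E[1 / (I3 + 1)] * E[I1 I2])."""
    law = atom_probabilities(q)
    joint = sum(p * x[0] * x[1] / (x[2] + 1) for x, p in law.items())
    inverse = sum(p / (x[2] + 1) for x, p in law.items())
    pair = sum(p * x[0] * x[1] for x, p in law.items())
    return joint, inverse * pair
```

For $q = 0.1$ all atoms have positive probability, and the two returned values are approximately 0.009 and 0.0095, in agreement with $q^2 - q^3$ and $(1 - 0.5q)q^2$.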

4. Sums of marked dependent trials. The case of Poisson process ap- 
proximation for sums of marked dependent trials is of particular interest as 
it has applications in computational biology, occupancy and random graphs. 
We devote this section to this case. 

Let $I_i$, $i \in \mathcal{I}$, be dependent indicators with $\mathcal{I}$ a finite or infinitely countable
index space and

$\mathbb{P}(I_i = 1) = 1 - \mathbb{P}(I_i = 0) = p_i, \qquad i \in \mathcal{I}.$

Let $U_i$, $i \in \mathcal{I}$, be $S$-valued independent random elements, where $S$ is a locally
compact second countable Hausdorff space with metric $d_0$ bounded
by 1. Assume that $\{U_i, i \in \mathcal{I}\}$ is independent of $\{I_i, i \in \mathcal{I}\}$. Our interest
is to approximate the distribution of $\mathcal{M} := \sum_{i \in \mathcal{I}} I_i \delta_{U_i}$ by that of a Poisson
process.

Let $\mathcal{H}(S)$ be the space of nonnegative integer-valued locally finite measures
on $S$. The metric $d_0$ will generate the first Wasserstein metric $d_1$ on
$\mathcal{H}(S)$ and second Wasserstein metric $d_2$ on probability measures on $\mathcal{H}(S)$ as
in Section 3 [see also Barbour and Brown (1992)]. For each $i \in \mathcal{I}$, let $A_i \subset \mathcal{I}$
such that $i \in A_i$. Let $\mu_i = \mathcal{L}(U_i)$, the law of $U_i$, $i \in \mathcal{I}$, and let $\boldsymbol{\lambda} = \sum_{i \in \mathcal{I}} p_i \mu_i$.
Define $V_i = \sum_{j \notin A_i} I_j$.



Theorem 4.1. We have \ = Y J ieiPi and 

5 

A + Vi + 1 



d 2 (CM,Po(\))<Kj2 E (| + T T^-)^i + min{e 1 , e2 } 



*eTjeAi\{i} 

(4.1) 

i£lj£A V 
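where

$\varepsilon_1 = (1 \wedge 1.65\,\lambda^{-1/2}) \sum_{i \in \mathcal{I}} \mathbb{E}\,\big|\mathbb{E}[I_i \mid (I_j, j \notin A_i)] - p_i\big|,$

and $(J_{ji}; j \in \mathcal{I})$ and $(I_j; j \in \mathcal{I})$ are defined on the same probability space with
$\mathcal{L}(J_{ji}; j \in \mathcal{I}) = \mathcal{L}(I_j; j \in \mathcal{I} \mid I_i = 1)$.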

Remark 4.2. The bound in (4.1) does not depend on the distribution
of the marks $(U_i)_{i \in \mathcal{I}}$, since the mean measure of the approximating Poisson
process has been chosen to reflect the contribution of the marks.

Remark 4.3. Since $\mathcal{M}$ is in general not a simple point process, the
Janossy density approach via (2.9) is not applicable. Also, due to the structure
of $\mathcal{M}$, the neighborhoods $\{A_\alpha, \alpha \in S\}$ cannot be determined. By introducing
a pseudometric and by lifting the process $\mathcal{M}$ from $S$ to a larger carrier
space $\Gamma = S \times \mathcal{I}$, the lifted process becomes simple and the neighborhoods
$\{A_\alpha, \alpha \in \Gamma\}$ determinable.

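Proof of Theorem 4.1. We consider the approximation on the lifted
space $\Gamma = S \times \mathcal{I}$ with pseudometric $\rho_0((s, i), (t, j)) = d_0(s, t)$. For each $\xi_l \in \mathcal{H}(\Gamma)$
($l$ means lifted), define $\xi \in \mathcal{H}(S)$ by $\xi(ds) = \sum_{i \in \mathcal{I}} \xi_l(ds, \{i\})$. Let $\mathcal{M}_l(ds, \{i\}) = I_i \delta_{U_i}(ds)$
and let $\boldsymbol{\lambda}_l(ds, \{i\}) = p_i \mu_i(ds)$. Then $\mathcal{M}_l$ is a simple point process
on $\Gamma$, $\mathcal{M}(ds) = \sum_{i \in \mathcal{I}} \mathcal{M}_l(ds, \{i\})$, $\boldsymbol{\lambda}(ds) = \sum_{i \in \mathcal{I}} \boldsymbol{\lambda}_l(ds, \{i\})$, and

$\rho_2(\mathcal{L}\mathcal{M}_l, \mathrm{Po}(\boldsymbol{\lambda}_l)) = d_2(\mathcal{L}\mathcal{M}, \mathrm{Po}(\boldsymbol{\lambda})).$

For each $(s, i) \in \Gamma$, define $A_{(s,i)} := S \times A_i$. Then $|\mathcal{M}_l^{(s,i)}| = V_i$.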
The first term in the upper bound of (3.3) becomes 

B jUr(! + T4TT) (M( ^ ) - l)/i *' ( * ) 

which gives the first term of the bound (4.1). Referring to (3.4), if we take the 
reference measure v(ds, {i}) = fj,i(ds), then cp((s,i)) =pi and for i\, . . . G 
X, where ii, . . . , i& are all different, 

jfc((si,«i), • • • , (s k ,ik)) = JP(C hr .. tik ), 
where

Ch,...,i k ■= {II = 1 for I = h,.- -,ik and Ii = for I ^ i u . . .,i k }. 

For a = (s,i), f3 = ((si,h), . . . , (s k ,i k )) e (A c ^ s ^) k , the numerator of (2.9) 
becomes 

E m 

r>0 " {ji,...,jr}cAi\{i} 

= F(Ij = 1 for j = . . . ,i k and Ij = for j G A\ \ {ii, . . . , i fc }); 
and the denominator of (2.9) is reduced to 

E^ E m>,. 

r>0 " {ji,...,j r }cAi 



Ij = 1 for j = i x , . . . , i k and Ij = for j G A\ \ {it, . . . , i k }). 
It follows that 
G((s,i),((si,ii), . . . ,(s k ,i k ))) 

= F(Ii = l\Ij = 1 for j = i 1 ,...,i k and Ij = for jeAf\{i u .. .,i k }). 
Therefore, 

g(( 8 ,i),M\ M) )=m\ij-JtAi) 

and 

/ E\g(( S ,i)Mi {s ' t)) )-4>((s,i))nd S ,{i})=Y / nnm^^A i ) -ni, 

which gives £\ of Theorem 4.1. On the other hand, in view of £2 in (3.5), we 
can write the Palm process associated with Mi at (s,i) as 

( Jji5 Uj {dt), ifj^i, 
M( S! i)(dt, {j}) = | 0, if j = i and t ^ s, 

{_ 5t(dt), if j = i and t = s. 

With this coupling, we have K-M^))"^ | = EjfA t J ji and \M\ M) \ = VJ. 
So, 



l(.M( s ,i)) (M) l A l^i I + 1 Vi J * + 1 

and 

^((•MM) (M) ,^! M) )<El J ii- J il' 
which yields $\varepsilon_2$ of Theorem 4.1. Finally, since

$\boldsymbol{\lambda}_l(ds, \{i\})\,\boldsymbol{\lambda}_l(dt, \{j\}) = p_i p_j\,\mu_i(ds)\,\mu_j(dt),$
the last term of (4.1) follows from the last term of (3.3). □

Bounds on $\mathbb{E}[(V_i + 1)^{-1} \mid I_j = 1]$ and $\mathbb{E}[(V_i + 1)^{-1} \mid I_i = I_j = 1]$ may be obtained by
applying Lemma 3.5. Sharper bounds can be achieved if additional information
about the relationship of the $I_i$'s is available, for example, if the $I_i$'s are
independent.

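Remark 4.4. If $I_i$, $i \in \mathcal{I}$, are locally dependent with neighborhoods $(A_i, i \in \mathcal{I})$, then

$\mathbb{E}\left[\dfrac{1}{V_i + 1}\right] \le \mathbb{E}\left[\dfrac{1}{V_{ij} + 1}\right],$

where $V_{ij} = \sum_{k \notin A_i \cup A_j} I_k$.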



Random indicators $(I_i; i \in \mathcal{I})$ are said to be negatively related (resp. positively
related) if, for each $i$, $(J_{ji}, j \in \mathcal{I})$ can be constructed in such a way
that $J_{ji} \le$ (resp. $\ge$) $I_j$ for $j \in \mathcal{I}$, $j \neq i$ [see Barbour, Holst and Janson (1992),
page 24].

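Proposition 4.5. Suppose $(I_i; i \in \mathcal{I})$ are negatively related, and let
$\lambda = \mathbb{E}\sum_{i \in \mathcal{I}} I_i$; then

$\mathbb{E}\left[\dfrac{1}{\sum_{i \in \mathcal{I}} I_i + 1}\right] \le \dfrac{1}{\lambda}.$

Proof. Indeed, since $(I_i; i \in \mathcal{I})$ are negatively related, for any decreasing
function $\phi$,

$\mathbb{E}\left[\phi\left(\sum_{i \in \mathcal{I} \setminus \{j\}} I_i\right) \,\Big|\, I_j = 1\right] \ge \mathbb{E}\left[\phi\left(\sum_{i \in \mathcal{I} \setminus \{j\}} I_i\right) \,\Big|\, I_j = 0\right],$

so for fixed $0 < z < 1$, $\mathbb{E}(z^{\sum_{i \in \mathcal{I} \setminus \{j\}} I_i} \mid I_j)$ is increasing in $I_j$ and $z^{I_j}$ is a
decreasing function in $I_j$, giving

$$\begin{aligned}
\mathbb{E}\,z^{\sum_{i \in \mathcal{I}} I_i}
&= \mathbb{E}\big[\mathbb{E}(z^{\sum_{i \in \mathcal{I} \setminus \{j\}} I_i} \mid I_j)\,z^{I_j}\big] \\
&\le \mathbb{E}\big[\mathbb{E}(z^{\sum_{i \in \mathcal{I} \setminus \{j\}} I_i} \mid I_j)\big]\,\mathbb{E}[z^{I_j}] = \mathbb{E}(z^{\sum_{i \in \mathcal{I} \setminus \{j\}} I_i})\,\mathbb{E}\,z^{I_j}
\end{aligned}$$

[see Liggett (1985), page 78]. Since $\mathcal{I}$ is a finite or infinitely countable index
set, by mathematical induction,

$\mathbb{E}\,z^{\sum_{i \in \mathcal{I}} I_i} \le \prod_{i \in \mathcal{I}} \mathbb{E}\,z^{I_i}.$

Hence

$$\begin{aligned}
\mathbb{E}\left[\frac{1}{\sum_{i \in \mathcal{I}} I_i + 1}\right]
&= \mathbb{E}\int_0^1 z^{\sum_{i \in \mathcal{I}} I_i}\,dz \le \int_0^1 \prod_{i \in \mathcal{I}} (1 - p_i + p_i z)\,dz \\
&\le \int_0^1 \prod_{i \in \mathcal{I}} e^{-p_i(1 - z)}\,dz \le \frac{1}{\lambda}.
\end{aligned}$$
□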

Corollary 4.6. With the same setup as in Theorem 4.1, suppose $(I_i; i \in \mathcal{I})$
are negatively related; then



l-e~ x 



(4.2) d 2 0CM,Po(A))<E]r(^ + 



P 2 i +PiJ2^j ~ J n 



Proof. By Theorem 4.1 with $A_i = \{i\}$ and $\varepsilon_2$, the first term of (4.1)
vanishes and the last two terms of (4.1) can be rewritten as (4.2). □

As we need to bound $\mathbb{E}[(V_i + 1)^{-1} \mid I_i = I_j = 1]$, it is relevant to ask whether
$(J_{ki}, k \in \mathcal{I})$ are also negatively (resp. positively) related if $(I_i; i \in \mathcal{I})$ are
negatively (resp. positively) related. The answer is generally negative, as
the following counterexample shows.

Counterexample 4.7. Choose four sets $B_i$, $1 \le i \le 4$, so that $\mathbb{P}(B_i) = q$,
$\mathbb{P}(B_i B_j) = bq^2$, $\mathbb{P}(B_i B_j B_k) = bq^3$, for all different $1 \le i, j, k \le 4$; and $\mathbb{P}(B_1 B_2 B_3 B_4) = bq^4$
with $b < 2$ and $q$ sufficiently small (e.g., $< 0.01$) so that the sets are properly
defined. Set $I_i = 1_{B_i}$. Then for any increasing function $\phi$ on $\{0, 1\}^3$ [see
Barbour, Holst and Janson (1992), page 27], we have

$$\begin{aligned}
&\mathbb{E}[\phi(I_2, I_3, I_4) \mid I_1 = 1] - \mathbb{E}\,\phi(I_2, I_3, I_4) \\
&\quad = q(b - 1)[\phi(1, 0, 0) + \phi(0, 1, 0) + \phi(0, 0, 1) - 3\phi(0, 0, 0)].
\end{aligned}$$

Hence, by Theorem 2.D of Barbour, Holst and Janson (1992), if we choose
$b >$ (resp. $<$) 1, then $(I_i; 1 \le i \le 4)$ are positively (resp. negatively) related.
But

$\mathbb{P}(J_{31} = J_{41} = 1 \mid J_{21} = 1) = \mathbb{P}(I_3 = I_4 = 1 \mid I_1 = I_2 = 1) = q^2$

and

$\mathbb{P}(J_{31} = J_{41} = 1) = \mathbb{P}(I_3 = I_4 = 1 \mid I_1 = 1) = bq^2,$

so

$\mathbb{P}(J_{31} = J_{41} = 1 \mid J_{21} = 1) <$ (resp. $>$) $\mathbb{P}(J_{31} = J_{41} = 1),$

which implies that $(J_{k1}, k = 1, \ldots, 4)$ are not positively (resp. negatively)
related.
5. Applications. In this section, we apply the main results in Sections 3 and 4
to the Matérn hard-core process, an occupancy problem and rare words in
DNA sequences, all of which are different in nature. The results in Section 4 
can also be applied to random graphs, for example, to the isolated vertices 
resulting from the deletion with small probability of each of the edges of a 
connected graph, where the resulting isolated vertices may remain in their 
original positions or may be distributed independently and randomly in a 
carrier space. Since this random graph problem is similar in nature to that 
of rare words in DNA sequences, it will not be discussed further in this sec- 
tion. A special case of this problem which involves counting the number of 
isolated vertices has been considered by Roos (1994) and Eichelsbacher and 
Roos (1999). 

5.1. Matérn hard-core process. Consider a Poisson number, with mean
$\mu$, of points placed independently and uniformly at random in $\Gamma$, where $\Gamma$ is
a compact subset of $\mathbb{R}^d$ with volume $V(\Gamma) \neq 0$. A Matérn hard-core process
$\Xi$ is produced by deleting any point within distance $r$ of another point,
irrespective of whether the latter point has itself already been deleted [see
Cox and Isham (1980), page 170]. More precisely, let $\{\alpha_n'\}$ be a realization
of points of the Poisson process. Then the points deleted are

$\{\alpha_n''\} = \{x \in \{\alpha_n'\} : |x - y| < r \text{ for some } y \neq x,\ y \in \{\alpha_n'\}\},$

and $\{\alpha_n\} := \{\alpha_n'\} \setminus \{\alpha_n''\}$ constitutes a realization of the Matérn hard-core
process $\Xi$ [see Daley and Vere-Jones (1988)].
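
The deletion rule can be written out directly. The following Python sketch applies it to a given finite list of points in $\mathbb{R}^d$; it assumes the points are supplied as coordinate tuples and uses the Euclidean distance, and the function name `matern_hard_core` is a hypothetical choice. Whether a point is deleted is decided from the original configuration, so a deleted point still causes the deletion of its close neighbours.

```python
import math

def matern_hard_core(points, r):
    """Keep exactly those points that have no other point at distance < r.

    The test always refers to the original configuration, so deletion of a
    point does not protect its neighbours from being deleted as well.
    """
    kept = []
    for i, x in enumerate(points):
        crowded = any(
            j != i and math.dist(x, y) < r for j, y in enumerate(points)
        )
        if not crowded:
            kept.append(x)
    return kept
```

For example, with `r = 0.1` the configuration `[(0.0, 0.0), (0.05, 0.0), (0.12, 0.0), (1.0, 1.0)]` is thinned to `[(1.0, 1.0)]`: the third point is removed because of the second, even though the second is itself removed.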

The Matérn hard-core process is one of the hard-core processes introduced
in statistical mechanics to model the distribution of particles with repulsive
interactions [see Ruelle (1969), page 6]. It is a special case of the distance
models [see Matérn (1986), page 37] and is also a model for underdispersion
[see Daley and Vere-Jones (1988), page 366].

Let $X_1, X_2, \ldots$ be independent uniform random variables on $\Gamma$, and let
$N$ be a Poisson random variable with mean $\mu$ and independent of $\{X_i; i \ge 1\}$.
Then the Poisson process for the arrival points in $\Gamma$ is $Z = \sum_{i=1}^N \delta_{X_i}$.
Let $B(x, r) = \{y \in \Gamma : 0 < d_0(y, x) < r\}$, the $r$-neighborhood of $x$, where
$d_0(x, y) = |x - y| \wedge 1$. Then the Matérn hard-core process $\Xi$ can be written
as $\Xi = \sum_{i=1}^N \delta_{X_i} 1_{\{Z(B(X_i, r)) = 0\}}$. Also,

$\Xi(d\alpha) = \sum_{i=1}^N \delta_{X_i}(d\alpha)\,1_{\{Z(B(X_i, r)) = 0\}} = 1_{\{Z(B(\alpha, r)) = 0\}}\,Z(d\alpha).$

Let $\kappa_d$ be the volume of the unit ball in $\mathbb{R}^d$ and let $d_2$ be the second Wasserstein
metric generated from $d_0$ as in Section 3.
Theorem 5.1. The mean measure of $\Xi$ is $\boldsymbol{\lambda}(d\alpha) = e^{-\mu V(\alpha, r)/V(\Gamma)}\,\mu\,V(\Gamma)^{-1}\,d\alpha$, and

$d_2(\mathcal{L}\Xi, \mathrm{Po}(\boldsymbol{\lambda})) \le 10\vartheta + \dfrac{6\vartheta\,[3 + (1 - e^{-2^{-d}\vartheta})\,\vartheta]}{1 + (1 - 2\vartheta)/\lambda},$

where $V(\alpha, r)$ is the volume of $B(\alpha, r)$ and $\vartheta = \mu \kappa_d (2r)^d / V(\Gamma)$.

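Proof. The Poisson property of $Z$ implies that the counts of points in
disjoint sets are independent. So

$\boldsymbol{\lambda}(d\alpha) = \mathbb{E}(\Xi(d\alpha)) = \mathbb{E}\,1_{\{Z(B(\alpha, r)) = 0\}}\,\mathbb{E}\,Z(d\alpha) = e^{-\mu V(\alpha, r)/V(\Gamma)}\,\mu\,V(\Gamma)^{-1}\,d\alpha.$

Also, whether a point outside $B(\alpha, 2r) \cup \{\alpha\}$ is deleted or not is independent
of the behavior of $Z$ in $B(\alpha, r) \cup \{\alpha\}$. Hence, we choose $A_\alpha = B(\alpha, 2r) \cup \{\alpha\}$
so that $\Xi$ is locally dependent with neighborhoods $(A_\alpha; \alpha \in \Gamma)$ and

$\mathbb{E}\left[\dfrac{1}{|\Xi^{(\alpha\beta)}| + 1}\,\Xi(d\alpha)\,\Xi(d\beta)\right] = \mathbb{E}\left[\dfrac{1}{|\Xi^{(\alpha\beta)}| + 1}\right] \mathbb{E}\,\Xi(d\alpha)\,\Xi(d\beta).$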
Applying Corollary 3.6 gives 

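Now,

$\mathbb{E}\,|\Xi^{(\alpha\beta)}| = \int_{\Gamma_{\alpha\beta}} e^{-\mu_\Gamma V(x, r)}\,\mu_\Gamma\,dx,$

where $\Gamma_{\alpha\beta} = \Gamma \setminus (A_\alpha \cup A_\beta)$ and $\mu_\Gamma = \mu/V(\Gamma)$. On the other hand,

$\mathbb{E}\,\Xi(d\alpha)\,\Xi(d\beta) = \begin{cases} e^{-\mu_\Gamma (V(\alpha, r) + V(\beta, r))}\,\mu_\Gamma^2\,d\alpha\,d\beta, & \text{if } |\alpha - \beta| \ge 2r, \\ e^{-\mu_\Gamma (V(\alpha, r) + V(\beta, r) - V(\alpha, \beta, r))}\,\mu_\Gamma^2\,d\alpha\,d\beta, & \text{if } r \le |\alpha - \beta| < 2r, \\ 0, & \text{if } 0 < |\alpha - \beta| < r, \\ e^{-\mu_\Gamma V(\alpha, r)}\,\mu_\Gamma\,d\alpha, & \text{if } \alpha = \beta, \end{cases}$

where $V(\alpha, \beta, r)$ is the volume of $B(\alpha, r) \cap B(\beta, r)$. Hence,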

E[ | H (^)|2 ]=E / / ~(dx)Z(dy) 

J Jx,y& a/3 

e -HrV(x,r)^ dx 

+ [f e-^ v ^ +v ^fi 2 r dxdy 

+ ff e -A*r(^(*,r)+V(v,r)-V(w)) A4 2 da . dl/ _ 
J A,j/er Q/3 ,r<|x-y|<2r 
Writing



we have 



[E | H («/3)|]2 = / / e -/*r(V(*,r)+V(y,r)) /i 2 



Var(|H( Q/3 )|) = / e-^ vix ' r) ^rdx 



J Jx,yer af3 ,r<\x-y\<2r 

e -MV(x,r)+V(y,r))£ dxdy 

3>,yeT a/3 ,\x-y\<2r 
< / e -/TV(x,r) Mrda . 

a/3 



J Jx,j/Gr a(3 ,|x-j/|<2r 

< {1 + (1 - e- WKdrd )/ir^(2r) d } / e~^ v ^' r ^ r dx. 



Thus, 



E(|H(^)| + 1) " E(|3W»)|) " +( )Mr«dW , 

which, together with Lemma 3.5, yields 

E 1 < 2 + K < 3 + (1 - e-^^ d )^ T K d (2rY 



|H(a/?) | + 1 - e-T^t^^r rfx + 1 - A + 1 - 2n v n d (2r) d 
Finally, 

/ / EE(da)E(dp) < [ f e'^^ nfdadp < fi r K d (2r) d \ 

JaGT J/3eA a \{a} J aer J /3eA a 

and 

X(da)X(df3) < n T K d {2r) d \. 



aer J/3&A 

Applying these inequalities to the relevant terms in (5.1) gives Theorem 5.1. 

□ 



5.2. Occupancy problem. Suppose $s$ balls are dropped independently into
$n$ urns with probability $p_k$ of going into the $k$th urn. Two cases of the
distribution of urns with given content have been studied in the literature.
They are urns with at most $m$ balls (right-hand domain) and urns with at
least $m$ balls (left-hand domain), where $m$ is a fixed nonnegative integer
[see Kolchin, Sevast'yanov and Chistyakov (1978) and also Barbour, Holst
and Janson (1992), Chapter 6]. In this section, we consider the right-hand
domain. So far, the focus in the literature has been on the total number
of urns [see Arratia, Goldstein and Gordon (1989) and Barbour, Holst and
Janson (1992), and references therein] and little attention has been paid to
the locations of the urns.

We assume the urns are numbered from 1 to $n$ and let $X_i$ be the number of balls in the $i$th urn, $1 \le i \le n$. Define a point process $\Xi$ on $\Gamma = [0,1]$ as follows:

\[
\Xi = \sum_{i=1}^{n} 1_{\{X_i \le m\}}\, \delta_{i/n}.
\]

The mean measure of $\Xi$ is then $\boldsymbol{\mu} = \sum_{i=1}^{n} \pi_i \delta_{i/n}$, where $\pi_i = \sum_{j=0}^{m} \binom{s}{j} p_i^j (1-p_i)^{s-j}$ and $\mu = \sum_{i=1}^{n} \pi_i$. Set $\boldsymbol{\lambda}(dt) = n\pi_i\,dt$ for $(i-1)/n < t \le i/n$, $i = 1, 2, \dots, n$, and $d_0(t_1,t_2) = |t_1 - t_2|$ for $t_1, t_2 \in \Gamma$. Let

\[
\mu' = \min_{i \ne j} \sum_{k \ne i,j,\ 1 \le k \le n} P(X_k \le m \mid X_i = X_j = 0)
\]

and

\[
\mu'' = \min_{1 \le i \le n} \sum_{j \ne i} P(X_j \le m \mid X_i = 0) \ge \mu'.
\]

If $s \min_{1 \le j \le n} p_j$ is large, then we would expect good Poisson process approximation.
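
To make the setup concrete, the following Python sketch computes the probabilities that an urn receives at most m balls, sums them to obtain the mean number of such urns, and simulates the locations i/n of those urns. It is an illustration only: the function names `pi_at_most_m` and `sample_occupancy_points` and all parameter values are our own assumptions and do not come from the references.

```python
import math
import random


def pi_at_most_m(s: int, p: float, m: int) -> float:
    """P(X <= m) for X ~ Binomial(s, p): chance that an urn ends with at most m balls."""
    return sum(math.comb(s, j) * p**j * (1 - p) ** (s - j) for j in range(m + 1))


def sample_occupancy_points(s, probs, m, rng):
    """Drop s balls into len(probs) urns and return the locations i/n
    (i = 1..n) of the urns that contain at most m balls."""
    n = len(probs)
    counts = [0] * n
    for urn in rng.choices(range(n), weights=probs, k=s):
        counts[urn] += 1
    return [(i + 1) / n for i, c in enumerate(counts) if c <= m]


if __name__ == "__main__":
    rng = random.Random(0)
    n, s, m = 50, 400, 2              # illustrative values, not from the paper
    probs = [1 / n] * n               # equiprobable urns (an assumption)
    mu = sum(pi_at_most_m(s, p, m) for p in probs)
    runs = 2000
    avg = sum(len(sample_occupancy_points(s, probs, m, rng)) for _ in range(runs)) / runs
    print(f"mean number of urns with at most {m} balls: exact {mu:.4f}, simulated {avg:.4f}")
```

The simulated average should be close to the exact mean; comparing the two is a quick sanity check on the formula for the urn probabilities.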

Theorem 5.2. With the above setup, 
d 2 (CE,Po(\)) 

(5.2) <J_ + ^ + ^[ E (|S|)-Var(|H|)] 

2n V/i /// 

1 f s ( In s + m In In s + 5m 4 \ 2 

(5.3) < — + C 7T. + - _______ —-H + - 



2n I fi \ s — In s — m In In s — 4m s 
where (5.3) is valid for $s > \ln s + m \ln\ln s + 4m$;

V 1 - 3p* /V 1" 
with $\pi^* = \max_{1 \le j \le n} \pi_j$ and $p^* = \max_{1 \le j \le n} p_j < 1/3$.

Proof. By the triangle inequality, $d_2(\mathcal{L}\Xi, \mathrm{Po}(\boldsymbol{\lambda})) \le d_2(\mathcal{L}\Xi, \mathrm{Po}(\boldsymbol{\mu})) + d_2(\mathrm{Po}(\boldsymbol{\mu}), \mathrm{Po}(\boldsymbol{\lambda}))$, so the term $1/(2n)$ follows immediately from estimating $d_2(\mathrm{Po}(\boldsymbol{\mu}), \mathrm{Po}(\boldsymbol{\lambda}))$ [see Brown and Xia (1995), (2.8)]. For each $1 \le i \le n$, let $I_i = 1_{\{X_i \le m\}}$; then $(I_i, 1 \le i \le n)$ are negatively related. Indeed, if $X_i \le m$, take $Y_{ji} = X_j$ for all $j$. If $X_i > m$, take a random variable $\tilde{X}_i$ which is independent of $\{X_1, \dots, X_n\}$ and has distribution $\mathcal{L}(X_i \mid X_i \le m)$, and take $X_i - \tilde{X}_i$ balls from urn $i$ and redistribute them to the other urns with probabilities $p_j/(1 - p_i)$ for $j \ne i$. Let $Y_{ji}$ be the number of balls in urn $j$ after the redistribution and set $J_{ji} = 1_{\{Y_{ji} \le m\}}$. This coupling $(J_{ji};\ 1 \le j \le n)$ satisfies

\[
\mathcal{L}(J_{ji};\ 1 \le j \le n) = \mathcal{L}(I_j;\ 1 \le j \le n \mid I_i = 1), \qquad J_{ji} \le I_j \ \text{for all } j \ne i
\]

[see Barbour, Holst and Janson (1992), page 122].
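
The coupling just described can be sketched in Python as follows. This is a minimal illustration under our own assumptions: the helper names `truncated_binomial` and `couple` are hypothetical, the conditional law of the count in urn i given that it is at most m is sampled by enumerating the truncated binomial probabilities, and the parameter values are arbitrary. The sketch only demonstrates the monotonicity property (the coupled indicators never exceed the original ones for the other urns); it does not verify the distributional identity.

```python
import math
import random


def truncated_binomial(s, p, m, rng):
    """Sample from Binomial(s, p) conditioned on being at most m,
    by enumerating the (unnormalized) conditional probabilities."""
    weights = [math.comb(s, j) * p**j * (1 - p) ** (s - j) for j in range(m + 1)]
    return rng.choices(range(m + 1), weights=weights, k=1)[0]


def couple(counts, probs, i, m, rng):
    """Return the urn counts after the coupling step for urn i.

    If urn i already holds at most m balls nothing changes. Otherwise its
    count is replaced by an independent draw from the conditional law, and
    the surplus balls are redistributed to the other urns with
    probabilities proportional to p_j (that is, p_j / (1 - p_i))."""
    y = list(counts)
    if y[i] <= m:
        return y
    s = sum(counts)
    target = truncated_binomial(s, probs[i], m, rng)
    surplus = y[i] - target
    y[i] = target
    others = [j for j in range(len(probs)) if j != i]
    weights = [probs[j] for j in others]
    for j in rng.choices(others, weights=weights, k=surplus):
        y[j] += 1
    return y


if __name__ == "__main__":
    rng = random.Random(1)
    probs = [0.4, 0.3, 0.2, 0.1]      # illustrative urn probabilities
    s, m, i = 12, 2, 0
    counts = [0] * len(probs)
    for urn in rng.choices(range(len(probs)), weights=probs, k=s):
        counts[urn] += 1
    y = couple(counts, probs, i, m, rng)
    print("original counts:", counts, "indicators:", [int(c <= m) for c in counts])
    print("coupled counts: ", y, "indicators:", [int(c <= m) for c in y])
```

Because balls are only ever added to the urns other than i, each of those urns can lose but never gain the property of holding at most m balls, which is the inequality between the two families of indicators.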
We have from Corollary 4.6 that 

(5.4) d 2 (£E, Po(/x)) < E^ (- + v 3 . —r ) ( " + *? ) ■ 

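Now, the above coupling can be modified to show that, for $l \ge 1$,

\[
(I_j;\ j \ne i_1, \dots, i_l \mid X_{i_1} = \cdots = X_{i_l} = 0)
\]

are also negatively related. In fact, denote $\mathbf{i} = (i_1, \dots, i_l)$. If $X_{i_1} = \cdots = X_{i_l} = 0$, take $Z'_{j\mathbf{i}} = X_j$ for all $j \ne i_1, \dots, i_l$. If one of $X_{i_1}, \dots, X_{i_l}$ is not 0, take all balls in urns $i_1, \dots, i_l$ and relocate them to the other urns with probabilities $p'_j := p_j/(1 - p_{i_1} - \cdots - p_{i_l})$ for $j \ne i_1, \dots, i_l$. After the relocation, let $Z'_{j\mathbf{i}}$ be the number of balls in urn $j$ and $J'_{j\mathbf{i}} = 1_{\{Z'_{j\mathbf{i}} \le m\}}$. Next, for $k \ne i_1, \dots, i_l$, if $Z'_{k\mathbf{i}} \le m$, take $J'_{jk\mathbf{i}} = J'_{j\mathbf{i}}$. If $Z'_{k\mathbf{i}} > m$, take a random variable $\tilde{Z}'_{k\mathbf{i}}$ which is independent of $\{Z'_{1\mathbf{i}}, \dots, Z'_{n\mathbf{i}}\}$ and has distribution $\mathcal{L}(Z'_{k\mathbf{i}} \mid Z'_{k\mathbf{i}} \le m)$. Remove $Z'_{k\mathbf{i}} - \tilde{Z}'_{k\mathbf{i}}$ balls from urn $k$ and redistribute them to the other urns with probabilities $p'_j/(1 - p'_k)$, $j \ne k, i_1, \dots, i_l$. After this, let $J'_{jk\mathbf{i}} = 1$ if there are at most $m$ balls in urn $j$; otherwise, let $J'_{jk\mathbf{i}} = 0$. These couplings satisfy

\[
\mathcal{L}(J'_{j\mathbf{i}},\ j \ne i_1, \dots, i_l) = \mathcal{L}(I_j,\ j \ne i_1, \dots, i_l \mid X_{i_1} = \cdots = X_{i_l} = 0),
\]

\[
\mathcal{L}(J'_{jk\mathbf{i}},\ j \ne k, i_1, \dots, i_l) = \mathcal{L}(J'_{j\mathbf{i}},\ j \ne k, i_1, \dots, i_l \mid J'_{k\mathbf{i}} = 1),
\]

\[
J'_{jk\mathbf{i}} \le J'_{j\mathbf{i}} \quad \text{for all } j \ne k, i_1, \dots, i_l,
\]

\[
J'_{j\mathbf{i}} \le I_j \quad \text{for all } j \ne i_1, \dots, i_l.
\]

In particular, if $\mathbf{i} = i$, then $J'_{ji} \le J_{ji}$ for $j \ne i$.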

By these couplings and Proposition 4.5, we have 

(5.5) E— <E— 
On the other hand, for $k \ne i$, denote $\mathbf{k} = (i, k)$; we have
h — Jki 



E 



J2j^i Jji + 1 



<E 



h — Jki 



m s 

E E e 

2 1= Z 2 =m+1 



Hj^i,k ^{YjiKm} + 1 



P(x fc = ii l y w = i 2 ) 



<E 



E 



1 



m s 



,-tE E p(-y*=ii,y«=i 2 ) 

i 1= l 2 =m+l 

F(I k = l,J ki = 0) 



J2jj^i,k J'jk + 1 



< 



< 



£ Ejj k P(/ Jfe = l,J JM =0) 
P(4 = l,J fci = 0) E(/ fc -J w ) 



Hence, 



E 



which, combined with (5.4) and (5.5), yields 



d 2 (£H,Po( M ))<f- + ^)E^ 



i=i 



E ^ ~~ E ^ J ni + ^ 

. fc^i k^i J 



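On the other hand, since for $k \ne i$, $E(J_{ki})\pi_i = P(I_k = I_i = 1) = E(I_k I_i)$, we have

\[
\sum_{i=1}^{n} \pi_i\, E\Bigl[\Bigl(\sum_{k \ne i} I_k - \sum_{k \ne i} J_{ki}\Bigr) + \pi_i\Bigr]
= \sum_{1 \le i,k \le n} E(I_k)E(I_i) - \sum_{i \ne k,\ 1 \le i,k \le n} E(I_i I_k)
= E(|\Xi|) - \mathrm{Var}(|\Xi|).
\]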

Therefore, (5.2) follows. To prove (5.3), we note from Theorem 6.D of Barbour, Holst and Janson [(1992), page 122] that

\[
1 - \frac{\mathrm{Var}(|\Xi|)}{E(|\Xi|)} \le \pi^* + \frac{s}{\mu}\left(\frac{\ln s + m \ln\ln s + 5m}{s - \ln s - m \ln\ln s - 4m} + \frac{4}{s}\right)^2.
\]

So, it remains to show that

(5.6, £ <f 1 - 3 "- + 2 "-Vfl 

To prove (5.6), notice that, for $1 \le i, j \le n$ with $i \ne j$,
J2 F(X k <m\X i = X j = 0) 

Pk Yfi Pk 



k^i,jO<l<m 



i J \i-Pi-PjJ V i-pi-pj 

s \jn „ ^-l/' 1 ~Pi-Pj-Pk 



> E E hia-^r' tt^w 

> G V 3 !; 2 ) s e e (;W-^r z 



Hence 



which implies (5.6). □ 

5.3. Rare words in biomolecular sequences. One of the important prob- 
lems in biomolecular sequence analysis is the study of the distribution of 
words in a DNA sequence. A DNA sequence may be regarded as a sequence 
of letters taken from the alphabet {A, C, G, T}. The letters A, C, G, T 
represent the four nucleotides: adenine, cytosine, guanine and thymine, re- 
spectively. They form two complementary pairs, namely {A, T} and {C, 
G}. 

Repetitions of a given word or a group of words, as well as occurrences of unusually large clusters of words, are known to have biological
functions. For example, unusually large clusters of palindromes are known 
to contain such significant sites as origins of replication and gene regulators. 
Here palindromes are symmetrical words of DNA in the sense that they read 
exactly the same as their reverse complementary sequences. In Leung and 
Yamashita (1999), palindromes of certain lengths are assumed to be inde- 
pendent and uniformly distributed in herpesvirus genomes, and the r-scan 
statistic is used to identify unusually large clusters of palindromes. 

It is commonly assumed that the bases of DNA are independent random variables taking values in the set {A, C, G, T}. Under this assumption, if each word of a particular type is represented by a point, then the points representing these words form a locally dependent point process. Theorem 4.1
in this paper provides an error bound for approximating such a point pro- 
cess by a Poisson process. The error bound can be used to find conditions 
for which the approximation is good. In general, the approximation is good 
if the words are rare in the sense that the probabilities of their occurrences 
are small. However, the error bound can be made more explicit only when 
the words are specified. 

As an application, Theorem 4.1 provides a mathematical basis for Poisson 
process modeling of rare words in a biomolecular sequence, and in partic- 
ular of palindromes in a DNA sequence. A consequence of this is that the 
observed rare words may be regarded as a realization of i.i.d. random vari- 
ables, thus providing a mathematical basis for the assumption in Leung and 
Yamashita (1999) that the points representing the palindromes are indepen- 
dent and uniformly distributed in the herpesvirus genomes. 

In Leung, Choi, Xia and Chen (2002), Poisson process approximation for palindromes in sixteen herpesvirus genomes is studied. The centers of palindromes in each herpesvirus genome are represented by the point process on $\{0, 1/n, 2/n, \dots, (n-1)/n, 1\}$:

\[
\Xi = \sum_{i=1}^{n} I_i\, \delta_{i/n}, \tag{5.7}
\]

where the length of the genome (number of base pairs) is denoted by $M$, the palindromes considered are of length at least $2L$ (the length must be even) and are called $2L$-palindromes, the center of a palindrome of length $2K$ is the $K$th base in the palindrome from the left, and the number of possible centers of $2L$-palindromes is $M - 2L + 1$, denoted by $n$. Also, $I_i$ is the indicator random variable for the occurrence of a $2L$-palindrome centered at base $i + L - 1$ of the DNA sequence. The palindromes are represented by their centers because the latter are fixed irrespective of the lengths of the former, whereas the first base pair of a palindrome of at least a certain length is random and will give rise to complications in the analysis if it is used to represent the palindrome.

Since $2L$-palindromes with centers sufficiently far apart (more specifically, further than $2L - 1$ bases apart) occur independently, the point process (5.7) is a sequence of marked locally dependent trials as described in Section 4 of this paper, to which Theorem 4.1 is applicable. Here $(I_i, 1 \le i \le n)$ are locally dependent with neighborhoods

\[
A_i = \{j : i - 2L + 1 \le j \le i + 2L - 1\} \cap \{1, 2, \dots, n\}, \qquad i = 1, 2, \dots, n.
\]

Take $\Gamma = [0, 1]$ and $d_0(x, y) = |x - y|$. Let $p_i = P(I_i = 1)$ and $p_{ij} = P(I_i = I_j = 1)$. It can be shown that $p_i = \theta^L$, where $\theta = 2(p_A p_T + p_C p_G)$ and $p_A$, $p_T$, $p_C$, $p_G$ are the probabilities of A, T, C, G, respectively.

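
As an illustration of the palindrome indicators and of the quantity 2(p_A p_T + p_C p_G), the following Python sketch scans a DNA string for windows of 2L bases that equal their own reverse complement and maps the hits to points i/n. The names `theta`, `palindrome_indicators` and `center_points` are our own, and the example sequence is made up; this is a sketch under those assumptions and not the procedure used in the cited genome study.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def theta(p_a: float, p_t: float, p_c: float, p_g: float) -> float:
    """Probability that two independent bases form a complementary pair."""
    return 2 * (p_a * p_t + p_c * p_g)


def palindrome_indicators(seq: str, L: int) -> list[int]:
    """Indicators for i = 1..n (stored 0-indexed), n = len(seq) - 2L + 1:
    entry i is 1 when the 2L bases starting at base i read the same as
    their reverse complement, i.e. a 2L-palindrome is centered at base i + L - 1."""
    n = len(seq) - 2 * L + 1
    indicators = []
    for start in range(max(n, 0)):
        window = seq[start:start + 2 * L]
        is_palindrome = all(
            COMPLEMENT[window[k]] == window[2 * L - 1 - k] for k in range(L)
        )
        indicators.append(int(is_palindrome))
    return indicators


def center_points(indicators: list[int]) -> list[float]:
    """Locations i/n in (0, 1] of the positions whose indicator equals 1."""
    n = len(indicators)
    return [(i + 1) / n for i, flag in enumerate(indicators) if flag]


if __name__ == "__main__":
    seq = "TTGAATTCAACGCGTT"           # made-up sequence for illustration
    L = 3
    ind = palindrome_indicators(seq, L)
    print("indicators:", ind)
    print("points:", center_points(ind))
    # With equiprobable bases each window is a 2L-palindrome with probability theta**L.
    print("theta**L for uniform bases:", theta(0.25, 0.25, 0.25, 0.25) ** L)
```

Only the first L positions of each window need to be compared with their mirror positions, which is why a window of 2L independent bases is a palindrome with probability equal to the L-th power of the pair probability.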
Suppose

n 

(5.8) PA=PT , pc = PG and 4<L<— . 

Then the next theorem follows from Theorem 4.1 with $U_i = i/n$, Lemma 3.5 and a two-step approximation as in Section 5.2; namely, first approximate $\Xi$ by a Poisson process with the same mean measure as that of $\Xi$ and then approximate the latter by a Poisson process on $[0, 1]$ with intensity measure $\lambda\,dx$.

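Theorem 5.3. We have

\[
d_2(\mathcal{L}\Xi, \mathrm{Po}(\boldsymbol{\lambda})) \le \frac{26}{\lambda}(b_1 + b_2) + \frac{1}{2n} \le 131 L \theta^{L/2}, \tag{5.9}
\]

where $\lambda = \sum_{i=1}^{n} p_i = n\theta^L$, $b_1 = \sum_{i=1}^{n} \sum_{j \in A_i} p_i p_j \le n(4L - 1)\theta^{2L}$,

\[
b_2 = \sum_{i=1}^{n} \sum_{j \in A_i,\ j \ne i} p_{ij} \le n(4L - 2)\theta^{3L/2}
\]

and $\boldsymbol{\lambda}(dx) = \lambda\,dx$.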

Since a proof of Theorem 5.3 is given in Leung, Choi, Xia and Chen (2002), we will not give one here. It suffices to mention that the explicit bound on the overlap probabilities in (5.9) is due to the explicit nature of the palindrome. In order for $\mathrm{Po}(\boldsymbol{\lambda})$ to be nondegenerate in the limit, $\lambda = n\theta^L$ must converge to a positive number as $n \to \infty$. This means that $L = \ln n / \ln(1/\theta) + d$, where $d$ is bounded. For such an $L$, the assumption (5.8) is satisfied for sufficiently large $n$, Theorem 5.3 holds and the upper bound in (5.9) tends to 0 as $n \to \infty$.
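
The relation between the number of possible centers, the palindrome half-length and the Poisson mean can be explored numerically. The sketch below evaluates the mean n times the L-th power of the pair probability, the closed-form upper bound 131 L θ^{L/2} quoted in (5.9), and the real-valued L that yields a prescribed mean. The function names and the numerical inputs (a genome with 150000 possible centers and equiprobable bases) are hypothetical choices of ours for illustration.

```python
import math


def poisson_mean(n: int, theta: float, L: int) -> float:
    """Mean number of 2L-palindromes, n * theta**L."""
    return n * theta**L


def upper_bound(theta: float, L: int) -> float:
    """Closed-form upper bound 131 * L * theta**(L/2) quoted in (5.9)."""
    return 131 * L * theta ** (L / 2)


def length_for_target_mean(n: int, theta: float, target: float) -> float:
    """Real-valued L solving n * theta**L = target,
    that is L = ln(n / target) / ln(1 / theta)."""
    return math.log(n / target) / math.log(1 / theta)


if __name__ == "__main__":
    n, th = 150_000, 0.25             # hypothetical genome size, uniform bases
    for L in range(4, 13, 2):
        print(f"L={L:2d}  mean={poisson_mean(n, th, L):10.3f}  bound={upper_bound(th, L):8.4f}")
    print("L giving mean 200:", length_for_target_mean(n, th, 200.0))
```

Running the loop shows the trade-off behind the choice of L: a larger L makes the bound smaller but also reduces the expected number of palindromes.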

A significant feature of the bound in (5.9) is that it has the Stein factor $1/\lambda$. This is crucial for accuracy, as the value of $\lambda$ ranges from about 100 to 300 for the sixteen herpesvirus genomes under study.

In Leung, Choi, Xia and Chen (2002), a direct proof of a special case of Theorem 4.1 with $U_i = i/n$ is given (see Theorem 1 and the Appendix). Also given are the details of deducing Theorem 5.3 from the special case of Theorem 4.1 and the proof of the upper bound $131 L \theta^{L/2}$ (see Lemmas 1 and 2 and Propositions 1 and 2). This upper bound is then used as a guide to choose optimal lengths of palindromes for the approximation. The scan statistic is then applied to identify unusually large clusters of palindromes for each of the sixteen herpesviruses.